PREFACE


This investigation is a result of an effort to provide AFSC,
Wright-Patterson A.F.B., Ohio with a computer simulation analysis
of the IBM 370/155 computer system's throughput performance and
batch-interactive-mix. I hope that the results will prove useful
to the Acquisition Management Information System (AMIS) department.

I wish  to  express  my  sincere  thanks  to  Major  Kenneth  Melendez 
of  AFIT/ENC  for  his  advice  and  leadership  as  my  advisor  and  his 
interest  in  this  effort.  I also  want  to  thank  my  wife  Linda  for 
her  emotional  support  throughout  this  undertaking. 

Alfred  H.  Linder,  III 




CONTENTS

Preface
List of Figures
List of Tables
Abstract

I. Introduction
    Background
    General Discussion
    Problem Statement
    Scope
    Approach

II. IBM 370 Operating Characteristics
    System Organization
    Main Storage
    Central Processing Unit
    Input/Output
    Byte Multiplexer Channels
    Block Multiplexer Channels
    Job Scheduler
    Multiprogramming
    System 2000
    Data Bank
    Data Base Management System
    Data Dictionary/Directory System
    Data Base Administrator
    User System Interface

III. System Requirements, Definitions and Performance Measures
    Job Processing Sequence
    Workload Parameters
    Performance Measures

IV. Development of Simulation Model
    Simulation Steps
    Model Formulation
    Logic Flow
    Data Preparation
    Programming Techniques

Validation of the Simscript Model
    Assumptions
    Steady State




of a simulation model that will allow an analysis to be performed on
the batch-interactive-mix and throughput performance of the IBM 370/155
computer, used by the Acquisition Management Information System (AMIS)
users. The model is driven by data obtained from the AMIS major
production jobs, and hence the validity of the results is limited.

In the recommendation section of this effort, cluster analysis is
presented as a method to collect data about all the types of AMIS jobs so
that a follow-on study could be made to achieve valid results from the
simulation model that has been constructed.

The basics of hardware and software are discussed, along with
some information concerning the System 2000 data base management
software package. After the operating characteristics have been
discussed, the need for system requirements, definitions and
performance measures is presented and the necessary simulation steps
are enumerated to show how the simulation model was constructed.


The  main  variables  of  interest  (CPU  utilization  and  gain  factor) 
are  analyzed  to  determine  the  throughput  performance  of  the  system  as 
the  interactive  workload  is  increased  and  as  the  number  of  interactive 
SIMULATION OF THE ACQUISITION MANAGEMENT INFORMATION SYSTEM

I. Introduction


A time sharing system should provide the best possible throughput
performance for batch and interactive jobs. Throughput is the
amount of useful work completed per unit of time given a particular
workload. (Ref. 30:16) Even after a computer system has been designed
and implemented it may be necessary to perform a batch-interactive-mix
analysis to determine if the system is being used in such a manner that
high throughput performance is achieved. A computer simulation analysis
of the operating system might reveal pertinent information to aid in the
determination of the exact number of batch and interactive terminals
to be used in achieving good throughput performance. A simulation model
was developed to analyze the throughput performance for an Air Force
Systems Command (AFSC) organization.

Background 

The Air Force Systems Command (AFSC) at Wright-Patterson Air Force
Base uses the System 2000 software package to store contracts as they are
developed from the conception phase through the completion phase, as well
as during the purchasing phase of the acquired system. There are
approximately 75 terminals, used at several locations across the United
States. These are connected to the System 2000, which runs on an IBM
370/155 model computer.

The Acquisition Management Information System (AMIS) was implemented
to support contract administration and disbursement activities. One of
the major objectives is to implement Source Data Automation (SDA) at the
buying activities, AF Plant Representative Offices (AFPROs), and the
Air Force Contract Management Department (AFCMD), thus providing them
with an interactive capability to update and query the central data
base.
Throughput performance could possibly be improved if a
batch-interactive-mix analysis is performed and the best ratio of batch
to interactive terminals is used by AFSC. A simulation program which
models the System 2000, I/O channels, controllers, memory allocation
technique of the IBM 370, job scheduling, and outputting of jobs
would provide results which can be used to determine the best
batch-interactive-mix. Good throughput performance assures that the work
being processed will be completed in time for its intended use as
illustrated in Figure 1.


[Figure 1: plot of the value of information processed, with curves
labeled C (Conversational), B (Batch, High-priority), D
(Deadline-driven), and Accumulated Batch (Low priority).]

Fig 1. Computer Response Time (From Ref. 8:11)
If there is a deadline for getting a job completed and the
throughput performance of the system is low, then the job that needs to
be executed may not be completed in time. Efficient use of computer
terminals is necessary to minimize the expense of processing data and
of retrieving meaningful results from the computer.

General  Discussion 

Batch and interactive terminals have advantages and disadvantages,
and a batch-interactive-mix analysis based upon the types of jobs run
by an organization will provide results that can be used to make
efficient use of computer terminal resources.

Terminals that provide access to and from the computer are
usually either batch or interactive. The term batch implies that
the entire program or data is routed to the computer at one time and
the results are routed back to the terminal after the complete execution
of the program. Interactive processing allows small segments of
data to be transmitted to the computer with results being sent
back to the terminal before more data is required.

When a batch job is submitted through the terminal, the job will
be scheduled for processing according to the job size, the
availability of the needed peripheral processors, the job priority,
and any other constraints the operating system designers may have
employed. These scheduling constraints can cause delays, especially
if large jobs require a large section of central memory. Several small
jobs can tie up most of the central memory available, and when one job
finishes there still may not be enough central memory available for
the extremely large job that has been waiting in the job queue. This
will force the job scheduler to by-pass the larger job because a
smaller job in the queue is available for processing. If the
large job has a high enough priority, some of the smaller jobs might be
rolled out of central memory to allow the larger job enough memory to
begin execution.

Scheduling batch jobs can become complex. "Its [scheduling
system] objective is to select jobs from all jobs available to provide
a well-balanced mix, thus enabling the computer to use its resources
efficiently, and at the same time meet all its deadlines (Ref. 3:27)."

Use of interactive terminals can minimize these scheduling problems
because their demands for processing are recognized almost instantaneously
due to the higher priority of interactive jobs. However, this creates
a new significant problem. "With interactive processing . . . system
resources must always be fully prepared to handle a worst case situation
(i.e., amount of memory needed)." (Ref. 8:14)

A single  computer  can  handle  hundreds  of  terminal  connections, 
and  the  number  of  batch  terminals  versus  the  number  of  interactive 
terminals  can  significantly  alter  the  operational  costs  of  using  the 
computer.  The  approximate  ratio  of  batch  to  interactive  terminals 
that  should  be  used  is  dependent  upon  the  computer  system  and  the  needs 
of  the  organization  the  computer  services. 

Problem  Statement 

AFSC is interested in the results obtained from a batch-interactive-mix
analysis performed on the AMIS jobs run on the IBM 370. The results
might aid in making operational changes that will increase the throughput
performance for AMIS users.

Scope 

The objective of this investigation is to build a computer simulation
model of the IBM 370 and System 2000 which will allow a
batch-interactive-mix analysis to be performed and the results presented
to AFSC. The model will be driven by estimates of the workload
characteristics of the AMIS jobs. A section is presented dealing with
obtaining valid input data which will lead to more accurate results from
the simulation analysis.

Approach

One way to determine whether an organization is achieving near
optimal throughput performance by using the best batch-interactive-mix
is to construct a computer model of the system being used and to use
input data that are representative of the jobs being run. When large
samples of workload characteristic data are needed, a computer simulation
model will help the analyst arrive at meaningful benchmarks. The sheer
volume of data needed to drive the simulation model would rule out
performing the analysis with a mathematical probabilistic model.

Simscript II.5 is a computer language that is ideal for simulating
the computer operating system because it allows the user to schedule
events when a pre-determined simulation time has elapsed. Results
from the Simscript computer simulation run can be used to evaluate the
performance of the system. System performance actually involves two
primary considerations. The first is the effectiveness of how the system
handles a specific job or request; the second involves efficiency or
how the system uses the resources available. Understanding both of
these considerations is essential when analyzing the performance of the
system. The most efficient use of system resources may not provide the
best throughput performance for a specific job, but it certainly will
provide good throughput performance over a given time period, with a
substantially reduced operational cost.
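
The idea of scheduling an event to occur after a given amount of
simulation time can be illustrated outside of Simscript. The following is
a minimal sketch in Python, not the Simscript II.5 mechanism itself; the
class name EventList and its methods are hypothetical and are introduced
here only for illustration.

```python
import heapq
import itertools


class EventList:
    """Future-event list ordered by simulation time (not wall-clock time)."""

    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps insertion order

    def schedule(self, delay, action):
        """Schedule `action` to run once `delay` time units have elapsed."""
        heapq.heappush(self._heap, (self.now + delay, next(self._seq), action))

    def run(self):
        """Advance the clock from event to event until none remain."""
        while self._heap:
            self.now, _, action = heapq.heappop(self._heap)
            action(self)


log = []
events = EventList()
events.schedule(5.0, lambda ev: log.append((ev.now, "CPU burst ends")))
events.schedule(2.0, lambda ev: log.append((ev.now, "job arrives")))
events.run()
```

The clock jumps directly from one event time to the next, so events
execute in time order regardless of the order in which they were
scheduled.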

The computer simulation model must be capable of handling certain
workload parameters such as: job CPU time, job I/O requests, CPU service
time, I/O service time, interarrival time, priority, memory requests,
number of simultaneous users, number of jobs in the system, etc.
(Ref. 29:12-13). If these workload parameters are not understood, or are
used improperly in the simulation model, the performance evaluation
results will be invalid or, at best, somewhat misleading. Workload
characterization of jobs and tasks should be fairly accurate if the
workload parameters are representative of the jobs run by the
organization.

Workload parameters can be combined, consequently producing data that
are representative of the jobs and tasks. Through a clustering analysis
technique the analyst produces data that are consistent, regardless of
the possible extreme values of a given parameter. According to Anderberg,

"Cluster analysis has been employed as an effective tool in
scientific inquiry. One of its most useful roles is to generate
hypotheses about category structure. An algorithm can assemble
observations into groups which prior misconceptions and ignorance
would otherwise preclude. An algorithm can also apply a principle
of grouping more consistently in a large problem than can a human.

. . . cluster analysis may be used to reveal structure and relations
in the data. It is a tool of discovery." (Ref. 3:4)

Developing a simulation model which uses correct workload parameters and
valid input data, and which allows the analyst to examine the system's
performance, is more than an intuitive art. Shannon [28] has suggested
some criteria that are instrumental in developing a successful simulation
model. The 11 suggested steps will help the simulation designer from the
initial stage of "system definition" through the "documentation" phase of
the design. (Ref. 28:23) These steps will be discussed in detail in
Chapter 3.

Understanding the system and analyzing the operation were important
in building the Simscript II.5 model of the AMIS System. This enhances
the development of a workload analysis methodology. Chapter 2 deals with
the operating characteristics. Chapter 3 discusses the job processing
sequence and performance measures while Chapter 4 discusses the model
formulation, logic flow and data preparation. The first phase of a
performance improvement effort is that of understanding the computer
operating system and the logical structure of the computer, which is
the topic of the next chapter.



II. IBM 370 Operating Characteristics

System Organization

"The IBM system 370 model 155 is a high-performance data processing
system that provides the reliability, availability, and convenience
demanded by business and scientific users, as well as by users with
applications in communications or control." (Ref. IBM System/370
System Summary: 6-9)

The following sections (system organization, system control and System
2000) are described using information from the IBM System/370 System
Summary, IBM System/370 Principles of Operation and System Operation and
Guide to Data Base Management.


Fig 2. IBM System/370 Logical Structures (From Ref. 21:13)


The single CPU system logical structure is comprised of main storage, a
central processing unit, and selector and multiplexer channels, which
communicate with each other by connecting paths.

Main  Storage 

Both data and program must be loaded into main storage before
processing is allowed. The main storage is set up to provide directly
addressable, fast-access storage of data. The main storage is
supplemented by a fast-access buffer called a cache.

In this type of buffer storage system a Storage Control Unit
(SCU) is placed between the processor and the main storage unit, as
illustrated in Figure 3.


[Figure 3: block diagram showing the buffer storage ("cache"), main
storage, and the Storage Control Unit (SCU).]

Fig 3. Organization of buffer memory hardware (From Ref. 14:192)

Exact copies of main memory (32 bytes at a time) are brought into the
cache whenever a reference is made to an address that is not already
in the cache. The new block is stored over an existing block in the
cache. Whenever a fetch request is generated, a check is performed
to determine if there is a need to bring in a different block from main
memory. This algorithm is simple enough to be built into the hardware
of the system and allows much faster fetches from the "copied"
portions of main memory which actually reside in the buffer storage
("cache") unit.

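The fetch check described above can be sketched in a few lines of Python.
This is an illustration only: the text does not say how a block is mapped
to a location in the buffer, so the direct-mapped placement used here, the
class name BufferStorage, and the slot count are all assumptions. Only the
32-byte block size comes from the text.

```python
BLOCK_SIZE = 32  # bytes copied from main storage per miss (from the text)


class BufferStorage:
    """Toy direct-mapped cache; the mapping scheme is an assumption."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = {}      # slot index -> block number currently held
        self.hits = 0
        self.misses = 0

    def fetch(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        block = address // BLOCK_SIZE
        slot = block % self.num_slots
        if self.slots.get(slot) == block:
            self.hits += 1
            return True
        # Miss: the new block is stored over whatever the slot held.
        self.slots[slot] = block
        self.misses += 1
        return False


cache = BufferStorage(num_slots=4)
results = [cache.fetch(a) for a in (0, 8, 31, 32, 128, 0)]
```

Addresses 0, 8 and 31 fall in the same 32-byte block, so only the first of
them misses; address 128 maps to the same slot as address 0 and overwrites
it, so the final reference to address 0 misses again.
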
Central  Processing  Unit 

The CPU is the controlling center of the computer system and
handles system related functions such as execution, interrupts,
initial program loading, etc. The CPU will handle 5 basic classes of
instructions: system control, general, decimal, floating point, and
input/output instructions. The basic instruction sets include decimal,
floating point, extended precision floating point, direct control, byte
oriented operand, dynamic address translation, extended control mode,
and store status and program reset.

Input/Output

Data is transferred between devices and main storage by attaching
the I/O devices to channels, and the actual communication between control
units and the specific channel takes place over a connector called the
I/O interface. I/O devices such as card readers, punches, magnetic tape
units, disc storage, drum storage, typewriter-keyboard devices, printers,
teleprocessing devices and sensor-based equipment are handled by the
associated I/O interface, which provides information format and control
signal sequences. I/O devices fall into several categories, most
of which are used for:

    Auxiliary storage.
    Machine and manual (keyed) input, both local and remote.
    Teleprocessing.
    Reading (or output) of external documents and displays.
    Process control.
    Data acquisition.
"An Input/Output operation transfers data between main storage
and an I/O device. An I/O operation is initiated by a program
instruction that generates a command to a channel. A control unit
receives the command via the I/O interface, decodes it, and starts
the I/O device." (Ref. 21:2-7)

Byte  Multiplexer  Channels 

Channels are the direct controllers of I/O devices and control units,
and allow the system to read, write and compute. This is accomplished
while relieving the CPU of having to communicate directly with the I/O
devices. "The byte multiplexer channels separate the operations of
high-speed devices from those of lower-speed devices." (Ref. System
Summary: 2-7) This allows channel communication to take place in one
of two different modes: the byte mode using the slower data rate I/O
devices and the burst mode using the higher data rate I/O devices.

When operating in the byte mode, the single data path can be
used by several low speed I/O devices (such as card readers, printers
and terminals) and each device takes its turn in sending data over the
multiplexed line. This is accomplished when the channel receives and
sends data to the I/O device on demand and is controlled by the channel
program.

In the burst mode, I/O devices (such as magnetic tape units, discs,
or data cell storage) are not under the control of the programmer and
after these high-speed devices have established a logical connection
with a channel, large amounts of data can be transferred in "bursts"
which are not multiplexed.
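
The difference between the two modes can be illustrated with a short
Python sketch. It is a simplification under stated assumptions: the
devices are served in strict rotation, whereas the text says the channel
serves devices on demand, and the function names byte_mode and burst_mode
and the device names are hypothetical.

```python
def byte_mode(streams):
    """Interleave the devices, one byte per device per turn."""
    order = []
    pending = {name: iter(data) for name, data in streams.items()}
    while pending:
        for name in list(pending):
            try:
                order.append((name, next(pending[name])))
            except StopIteration:
                del pending[name]   # this device has nothing left to send
    return order


def burst_mode(streams):
    """Each device keeps the data path until its whole record is sent."""
    return [(name, byte) for name, data in streams.items() for byte in data]


streams = {"reader": "AB", "printer": "xyz"}
interleaved = byte_mode(streams)
bursts = burst_mode(streams)
```

In byte mode the transfers of the two devices alternate on the shared
path; in burst mode the first device finishes its record before the
second device is served.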

Block  Multiplexer  Channels 

The block multiplexer channels operate high-speed I/O devices on
a single data path and can also operate in one of two modes: block
multiplex or selector mode. In the block multiplex mode the channels
permit interleaving of channel programs and allow initiation and
termination of these programs to occur sooner than is allowed by
selector channels. The block multiplexing mode allows more data to be
transferred during its burst mode than can be routed by the byte burst
mode. Entire records and blocks can be transferred during each burst
so that block multiplexer channels are used with faster I/O devices than
are used by the byte multiplexer channels.

When operating in the selector mode, I/O devices are attached to
the selector channel, which transmits data to or from a single I/O device
at a time. These selector channels can be operated with either slow
or high speed devices but are especially suitable for high speed I/O
devices. Once a selector channel has attached a particular I/O device
it will transmit data until all data has been handled and no other I/O
device can interfere with the selector channel.

Information about the hardware configuration has been presented and
it is hoped that the reader is a little more familiar with the computer
system. General information concerning the software of the system is
discussed now to further aid in the understanding of the system.

Job  Scheduler 

All jobs are classified into 11 membership classes. The requirements
for each class are listed in Table 1, where K is the region size in memory
(in blocks of 1024 bytes) and T is the CPU time in minutes. The job
class represents the programmer's estimate of the resources required
for the job and is used to schedule and prioritize jobs. The following
classes are used on the IBM 370/155 A.S.D.
Table 1.

Scheduling Priorities

Class   Region           Time      Tape Mounts   Disk Mount
                                   Permitted     Permitted

A       K ≤ 200          T ≤ 5     NO            NO
B       K ≤ 260          T ≤ 60    NO            NO
        and (K > 200 or T > 5), i.e., job cannot be run in class A
C       260 < K ≤ 500    T ≤ 60    NO            NO
D       K ≤ 200          T ≤ 5     YES           NO
E       K ≤ 260          T ≤ 60    YES           NO
        and (K > 200 or T > 5), i.e., job cannot be run in class D
F       260 < K ≤ 500    T ≤ 60    YES           NO
G       K ≤ 200          T ≤ 5     YES           YES
H       K ≤ 260          T ≤ 60    YES           YES
        and (K > 200 or T > 5), i.e., job cannot be run in class G
I       260 < K ≤ 500    T ≤ 60    YES           YES
J       K [?] 500        T [?]     YES           NO
K       K [?]            T [?]     YES           YES
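
The class limits of Table 1 can be expressed as a small decision function.
The Python sketch below is hedged in several ways: it covers classes A
through I only, it reads every limit as "less than or equal to" (an
assumption, since the comparison signs are hard to read in the table), it
assigns a job to the least permissive row that allows the mounts it needs,
and the function and parameter names are hypothetical. Classes J and K are
omitted because their limits are not legible.

```python
def job_class(region_k, cpu_minutes, needs_tape=False, needs_disk=False):
    """Return the class letter A-I, or None if the job fits none of them."""
    if region_k <= 200 and cpu_minutes <= 5:
        size = 0                      # classes A, D, G
    elif region_k <= 260 and cpu_minutes <= 60:
        size = 1                      # classes B, E, H
    elif region_k <= 500 and cpu_minutes <= 60:
        size = 2                      # classes C, F, I
    else:
        return None                   # classes J and K are not modeled here

    if needs_disk:
        row = "GHI"                   # tape and disk mounts permitted
    elif needs_tape:
        row = "DEF"                   # tape mounts permitted
    else:
        row = "ABC"                   # no mounts permitted
    return row[size]


small_job = job_class(150, 3)
long_small_job = job_class(150, 30)
tape_job = job_class(400, 30, needs_tape=True)
```

Because the branches are tested in order, a job reaches the second size
band only if it failed the first, which reproduces the "cannot be run in
class A" condition attached to class B.
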
Multiprogramming


Once a job has been assigned a class priority it is also ranked
by priority within that class. The highest priority job will always
have access to the CPU under this multiprogramming scheme. Once a
job has accessed the CPU it will continue being processed until the
job must relinquish the CPU because of an I/O request, until a higher
priority job has been scheduled, or until a higher priority job has
completed its I/O. The job that was interrupted is put into a wait state
and must compete once again with all jobs available to be processed in
accordance with the priority established in Table 1. The amount of
I/O a job performs does not affect the priority algorithm and this
multiprogramming scheme works well if the higher priority jobs are
I/O bound while the lower priority jobs are CPU bound.

Because there is no time slicing, the lower priority jobs must
wait until all the higher priority jobs are handling I/O and this means
that these jobs will be waiting for long periods of time in between
opportunities to access the CPU. CPU utilization will usually be
high with good turnaround time for the higher priority jobs. Degradation
of turnaround time will occur for the lower priority jobs. Fortunately
the AMIS jobs perform a lot of I/O when run on the System 2000 data base
management system. The large number of I/O requests allows the lower
priority jobs to frequently access the CPU.
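
The dispatching rule described in this section can be sketched as a
selection over the jobs that are not waiting on I/O. The Python fragment
below is an illustration under assumptions: priorities are plain integers
with larger values meaning higher priority, and the names Job and dispatch
are hypothetical rather than part of the operating system.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int               # larger number = higher priority (assumption)
    waiting_on_io: bool = False


def dispatch(jobs):
    """Return the highest-priority job that is ready for the CPU, or None."""
    ready = [j for j in jobs if not j.waiting_on_io]
    return max(ready, key=lambda j: j.priority, default=None)


jobs = [Job("query", 9), Job("update", 7), Job("report", 2)]
first = dispatch(jobs)          # the highest-priority job gets the CPU
jobs[0].waiting_on_io = True
jobs[1].waiting_on_io = True
second = dispatch(jobs)         # only now can the low-priority job run
```

With no time slicing, the low-priority job is selected only when every
higher-priority job is waiting on I/O, which is why frequent I/O by the
higher-priority jobs matters for its turnaround time.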

System  2000 

The System 2000 software package supports the AMIS requirement of
storing and modifying contracts for Air Force Systems. This system
simplifies the actual storing and accessing of contract related data.


"A data base is generally acknowledged to be a collection of
multiple logical files containing interrelated but nonredundant
data which can be accessed by one or more applications. A data
base management system is a software tool - actually a collection
of routines - used to define and maintain the data base's
logical structure, and provide a means by which data can be
retrieved." (Ref. 4:1)

The System 2000 data base is comprised of 5 functional elements:
Data Bank, Data Dictionary/Directory System, Data Base Administrator,
Data Base Management System and the User System Interface. The
inter-relationship between these functional elements is shown in
Figure 4.


[Figure 4: block diagram; the surviving labels are User System
Interface and Data Base Management System.]

Fig 4. Data Base Management: Conceptual Environment (From Ref. 4:5)


Data  Bank 


The data bank is comprised of a collection of data bases organized
to provide maximum performance of the system. The physical location of
these data bases is not important as each is logically connected with
the Data Dictionary/Directory System, which is capable of coordinating
centralized or decentralized data base files. Queries, updates,
referencing, etc. can access the Data Bank through the Data Base
Management System.

Data  Base  Management  System 

The Data Base Management System consults the Data Dictionary/Directory
System "... for information about data to provide the
parameters necessary for the generalized software/hardware to execute
user request." (Ref. 4:7) The Data Base Management System is the
actual software package for accessing and storing data.

Data  Dictionary/Directory  System 

The Data Dictionary/Directory System has two objectives. The
first is the collection and dissemination of data which supplies the
user with meta-data (data about data). The second is that of
establishing standards for coding conventions, data naming and usage.

Data  Base  Administrator 

The major responsibilities of the Data Base Administrator (Ref. 4:6)
are:

    Definition of the content and structure of the data base.
    Control of data access and modification rights to the data base.
    Advising data base users on efficient techniques for extracting data.
    Establishing data entry, edit, and validation standards.
    Maintenance of the Data Dictionary/Directory System.
    Maintenance of the Data Base Management System.
    Keeping track of available physical storage.

User  System  Interface 

To adequately serve its users the data base must use interfaces
that provide for 3 main areas of User System Interface support.

The first is the language capability. The use of a natural language
will help the user formulate requests and problem definitions that are
easier to use when communicating with the system.

The second is "Interactive Capability - the ability to browse in
search of solutions supports the user decision process by providing
the means to develop new alternatives." (Ref. 4:?) The last is the
"Auxiliary Subsystems - use will provide subsystems to assist users
in putting system output into the most humanely comprehensible form.
This category includes graphic techniques, algorithmic processes,
and modeling and/or simulation tools." (Ref. 4:?)

The first phase in improving the performance of the computer system
is that of understanding the system in terms of hardware
configuration and related software programs in use. The second phase
involves analyzing the operations and understanding the system
requirements, definitions and performance measures.
III. System Requirements, Definitions and
Performance  Measures 


Job  Processing  Sequence 

The system simulation program that has been constructed deals
with the following basic job processing sequence as outlined by
MacDougall (Ref. 26:191-192). The Simscript program itself is
located in Appendix B.

1. Job arrival.

2. Request for central memory (CM) - if available, it is allocated; if
not, the job is entered into the CM queue.

3. Request for central processor - if available, the job is assigned
to it and executes until an I/O request is encountered, until execution
completes, or until a higher priority job interrupts the CPU. If the
central processor is not available then the job is entered into a
CPU queue.

4. Request for an I/O - the job is released from the central processor
and if another job is waiting for the CPU, the job waiting will be
assigned to the CPU. If the disc is free, the disc is assigned to the
job to process the I/O request; if the disc is busy, the request is
entered in a queue.

5. On completion of processing of an I/O request, the disc is released
and the central processor requested once again. (When the disc is
released, the disc queue is checked; if there is a waiting request,
it is assigned to the disc.)

6. When a job completes execution, it releases the central processor
and its central memory space is released. (The CM queue is checked to
determine if there is a job waiting to which space now can be assigned,
and the CPU queue checked to determine if there is a waiting job
which now can be assigned to the central processor.)

7. The job leaves the system and the sequence is repeated.
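
The seven steps above can be illustrated with a small event-driven
sketch. It is written in Python rather than Simscript and is not the
program of Appendix B; the names Job and simulate, the single CPU and
single disc, first-come first-served queues, and the omission of
priority interrupts are all simplifying assumptions made for
illustration.

```python
import heapq
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    arrival: float
    memory: int
    cpu_bursts: list          # CPU time before each I/O; the last burst ends the job
    io_time: float            # time for one I/O request (hypothetical constant)
    next_burst: int = 0
    finished: Optional[float] = None


def simulate(jobs, memory_size):
    """Run the basic job processing sequence; return completion times."""
    events, seq = [], 0       # event set ordered by (time, insertion order)
    cm_queue, cpu_queue, disc_queue = deque(), deque(), deque()
    state = {"free": memory_size, "cpu_busy": False, "disc_busy": False}

    def schedule(time, kind, job):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, job))
        seq += 1

    def request_cpu(now, job):                    # step 3
        if state["cpu_busy"]:
            cpu_queue.append(job)
        else:
            state["cpu_busy"] = True
            schedule(now + job.cpu_bursts[job.next_burst], "cpu_done", job)

    def request_memory(now, job):                 # step 2
        if job.memory <= state["free"]:
            state["free"] -= job.memory
            request_cpu(now, job)
        else:
            cm_queue.append(job)

    def request_disc(now, job):                   # step 4
        if state["disc_busy"]:
            disc_queue.append(job)
        else:
            state["disc_busy"] = True
            schedule(now + job.io_time, "io_done", job)

    for job in jobs:                              # step 1
        schedule(job.arrival, "arrival", job)

    while events:
        now, _, kind, job = heapq.heappop(events)
        if kind == "arrival":
            request_memory(now, job)
        elif kind == "cpu_done":
            state["cpu_busy"] = False
            job.next_burst += 1
            if job.next_burst < len(job.cpu_bursts):
                request_disc(now, job)            # step 4
            else:                                 # steps 6 and 7
                state["free"] += job.memory
                job.finished = now
                while cm_queue and cm_queue[0].memory <= state["free"]:
                    waiting = cm_queue.popleft()
                    state["free"] -= waiting.memory
                    cpu_queue.append(waiting)
            if cpu_queue and not state["cpu_busy"]:
                request_cpu(now, cpu_queue.popleft())
        elif kind == "io_done":                   # step 5
            state["disc_busy"] = False
            if disc_queue:
                request_disc(now, disc_queue.popleft())
            request_cpu(now, job)
    return {job.name: job.finished for job in jobs}
```

With two jobs of 60 units of memory each and 100 units of central
memory, the second job waits in the CM queue until the first one
completes, which is the behaviour described in steps 2 and 6.
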


The above processing sequence is modified slightly for interactive
jobs because the interactive job can be thought of as operating in
one of two states (Ref. 9:172). The think state is the time
between a computer's response to the terminal request and when another
request is made by the user, such as the hitting of the carriage
return. The system state is the time from the user's request until
the computer has processed that request. When the processed request
has been completed, an "interaction" has occurred and the interactive
process has once again entered the think state.

Buzen has verified that the average response time can
be calculated by the following formula:

\[ R = \frac{1}{J} \sum_{k=1}^{N} r(k) \qquad (1) \]

where,

"R = Average response time (i.e., average amount of time in system state
per interaction).

r(k) = total time that the k-th interactive process (i.e., the inter-
active process associated with the k-th terminal) spends in system
state during the observation interval (k = 1, 2, ..., N)."

J = Number of interactions completed during the observation time
interval.

Buzen also claims that the average think time can be calculated
by:

\[ Z = \frac{1}{J'} \sum_{k=1}^{N} z(k) \qquad (2) \]

Where,

"Z = Average think time (i.e., average amount of time in think
state per transition from think state to system state).

J' = total number of transitions from think state to system state
during the observation interval.

z(k) = total time that the k-th interactive process spends in think
state during the observation interval (k = 1, 2, ..., N)." (Ref. 9:172-173)

Because interactive processes behave in this way, the simulation model
must account for the response time. During the system state, steps 3
through 5 of the job processing sequence apply to each interaction,
which justifies considering the interactions as "separate jobs".
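
As a small worked illustration of the response time and think time
averages attributed to Buzen above, the following Python sketch divides
the summed per-terminal times by the number of interactions or
transitions. The function names and the sample figures are assumptions
chosen for the example, not data from the study.

```python
def average_response_time(system_times, interactions):
    """R: total system-state time over all terminals / interactions J."""
    if interactions <= 0:
        raise ValueError("at least one completed interaction is required")
    return sum(system_times) / interactions


def average_think_time(think_times, transitions):
    """Z: total think-state time over all terminals / transitions J'."""
    if transitions <= 0:
        raise ValueError("at least one think-to-system transition is required")
    return sum(think_times) / transitions


# Three terminals (N = 3) observed over the same interval, times in seconds.
r = [12.0, 8.0, 10.0]      # r(k): time in system state per terminal
z = [90.0, 60.0, 30.0]     # z(k): time in think state per terminal
print(average_response_time(r, interactions=15))   # 2.0 seconds
print(average_think_time(z, transitions=12))       # 15.0 seconds
```
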

Workload  Parameters 


Within the basic job processing sequence the simulation program
handles the following workload parameters as modified from the list
compiled by Svobodova (Ref. 29:12-13).

Job CPU Time                  Total CPU time requested by the job, or CPU
                              time requested by a single interactive command.

CPU Service Time              Time required to process a single CPU
                              operation.

I/O Service Time              Time to complete or process a single
                              I/O task.

Interarrival Time             Time between two successive requests
                              for any given resource.

Priority                      Priority assigned to a job by the
                              algorithm shown in Table 1.

Blocked Time                  Time the job must wait for CPU
                              service in a wait state.

Memory Requests               Amount of core required by the individual
                              job.

User Response Time            Time it takes the user to submit another
                              request after a response from the CPU.

Number of Simultaneous        The total number of interactive users
Users                         concurrently logged on.

Number in the System          Total number of batch and interactive jobs
                              operating within the system.


The Simscript programming language allows an entity to possess certain
attributes which allow the simulation program to "move" a batch or
interactive job "through" the system, carrying with it the necessary
workload parameters to analyze the job workload characteristics. As the
job moves through the system - enters queues, is assigned to the central
processor, etc. - the job carries with it all the information needed
throughout the simulation process. Once the "job output simulation"
occurs the individual job is destroyed in the system but the statistics about
the job are kept by use of a TALLY statement. "The Tally Statement
computes statistical quantities and prepares histograms for time-
independent variables." (Ref. Simscript Manual) The simulated job can
therefore be destroyed, while specific data collection counters and routines
that compute statistical quantities are used in conjunction with an ACCUMULATE
statement to produce a statistical analysis of the collective job workload
characterizations. These statistics are useful in determining some
important system performance measures, such as those listed by Svobodova
(Ref. 29:16-18).
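
The distinction between the two kinds of statistics can be sketched in
Python. The classes below are illustrative stand-ins and not the
Simscript statements themselves: Tally keeps simple per-observation
statistics that survive after a job is destroyed, and Accumulate keeps
a time-weighted average of a quantity that changes during the run. The
class and method names are assumptions made for this example.

```python
class Tally:
    """Statistics over individual observations (time-independent)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.maximum = None

    def record(self, value):
        self.count += 1
        self.total += value
        if self.maximum is None or value > self.maximum:
            self.maximum = value

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


class Accumulate:
    """Time-weighted average of a variable that changes over time."""

    def __init__(self, initial=0.0, start=0.0):
        self.value = initial
        self.start = start
        self.last_time = start
        self.area = 0.0           # integral of the value over time so far

    def update(self, now, new_value):
        self.area += self.value * (now - self.last_time)
        self.value = new_value
        self.last_time = now

    def mean(self, now):
        elapsed = now - self.start
        if elapsed <= 0:
            return 0.0
        return (self.area + self.value * (now - self.last_time)) / elapsed


# Turnaround times recorded as jobs leave the system; the jobs themselves
# no longer need to exist once their figures have been tallied.
turnaround = Tally()
for minutes in (10.0, 20.0, 30.0):
    turnaround.record(minutes)

# CPU busy indicator: idle until t=2, busy until t=6, idle afterwards.
cpu_busy = Accumulate()
cpu_busy.update(2.0, 1.0)
cpu_busy.update(6.0, 0.0)
print(turnaround.mean, cpu_busy.mean(10.0))   # 20.0 0.4
```
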

Performance  Measures 

Throughput is a good measure of system responsiveness. Since
it is the amount of useful work completed within a given time period,
given a particular workload, there are several performance measures
that will give an indication of the system throughput performance.

The  turnaround  time  is  the  elapsed  time  between  submitting  a job  and 
the  time  the  results  are  received  at  the  printer  or  terminal.  The 
turnaround  time  can  be  a good  indication  of  throughput  performance. 

The elapsed time multiprogramming factor (ETMF) is a numerical value
calculated by dividing the turnaround time of a particular job run
under multiprogramming by the turnaround time of the job had this
been the only job in the system. The ETMF is a measure of only one
job in the system, whereas the gain factor is a measure of several
sequential jobs.

The gain factor is determined by finding the time needed to
execute a set of jobs under multiprogramming divided by the total
system time needed to execute the same jobs sequentially without the
capability of multiprogramming. Since types of jobs programmed may
vary considerably from day to day and during different time periods
within a day, it is important that the gain factor be based upon results
obtained from different time periods. Another measure of system
performance is the CPU productivity or CPU utilization.
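
The ETMF and the gain factor defined above are both simple ratios, as
the following Python sketch shows. The function names and sample times
are assumptions for illustration. The gain factor is computed in the
order the text states (multiprogrammed time divided by sequential
time); the baseline value of 2.3684 reported later in this chapter is
greater than one, which would correspond to the reciprocal ratio, so an
optional flag is provided for that convention.

```python
def etmf(turnaround_multiprogrammed, turnaround_alone):
    """Elapsed time multiprogramming factor for a single job."""
    return turnaround_multiprogrammed / turnaround_alone


def gain_factor(time_multiprogrammed, time_sequential, reciprocal=False):
    """Gain factor for a set of jobs.

    By default the ratio follows the wording of the text; with
    reciprocal=True the sequential time is divided by the
    multiprogrammed time instead (an assumed alternative convention).
    """
    if reciprocal:
        return time_sequential / time_multiprogrammed
    return time_multiprogrammed / time_sequential


# A job that takes 12 minutes alone and 30 minutes in the job mix.
print(etmf(30.0, 12.0))                            # 2.5
# A set of jobs: 40 minutes multiprogrammed, 100 minutes one at a time.
print(gain_factor(40.0, 100.0))                    # 0.4
print(gain_factor(40.0, 100.0, reciprocal=True))   # 2.5
```
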

The CPU productivity is the percent of time the CPU is in use
performing useful work. Multiprogramming should allow the system to
perform with high CPU utilization because as one job has completed a
CPU task and relinquished the CPU for an I/O there will usually be a
job that has been waiting for the CPU in the CPU queue. Of course
there is some overhead time, which is CPU time required by the operating
system. CPU productivity could be very low if the organization's jobs
are consistently I/O bound.

The wait time for I/O is the time necessary to process an I/O
task and therefore, if all the multiprogrammed jobs are involved in
I/O tasks, the CPU will be idle until one of the jobs completes its
I/O. If this occurrence of I/O bound jobs is common, there may be
many times the CPU is idle and hence the CPU productivity will be
low. If the CPU productivity is constantly low then the availability
of the system will be high.

The availability of a system is the percentage of time the
system is available to the users. Low availability will cause the
external delay factor to be high (job turnaround time / the total CPU
processing time required) and the throughput to be low.
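
A minimal numerical sketch of the CPU productivity and external delay
factor described above, in Python. The function names and the sample
figures are assumptions for illustration only.

```python
def cpu_utilization(busy_time, elapsed_time):
    """Percent of the elapsed time during which the CPU did useful work."""
    if elapsed_time <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * busy_time / elapsed_time


def external_delay_factor(turnaround_time, cpu_time_required):
    """Job turnaround time divided by the total CPU processing time required."""
    return turnaround_time / cpu_time_required


# A CPU busy for 3 of 4 hours, and a job needing 40 CPU-seconds that
# is returned to the user after 600 seconds.
print(cpu_utilization(3.0, 4.0))           # 75.0
print(external_delay_factor(600.0, 40.0))  # 15.0
```
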

A computer simulation model that uses good workload parameters and
is capable of analyzing throughput performance with adequate performance
measures will yield results that will help to remedy poor CPU utilization
and improve the turnaround time for the organizational users. A reliable
and accurate model is constructed by following the steps in the
flowchart of Figure 5. (Ref. 28:24)

Development  of  Simulation  Model 

The flow chart on pages 24 and 25 shows the necessary stages that must
be considered and implemented in order to achieve the construction of
a reliable simulation model. The 11 steps that Shannon has incorporated
into this process will be described in an abbreviated format. (Ref. 28:23)

1) System Definition - The Simscript II.5 model was developed within
the boundary of considering the CPU time, I/O time, priority of the
job and the need for disc or tapes. The measures of effectiveness
of the model include: job turnaround time, CPU utilization, and
gain factor.

2) Model Formulation - Reduction of the real system to be simulated
into a logic flow diagram. This is shown in Figures 6a and 6b.

3)  Data  Preparation  - Identification  of  data  needed  by  the 
Simscript  model  which  is  described  on  page  33. 

4)  Model  Translation  - Description  of  the  model  in  the  Simscript  language. 

5) Validation - Increasing the level of confidence in the model and
the results the model generates. This is discussed on page 35.

6) Strategic Planning - Design of an experiment that shows how the
gain factor and CPU utilization vary with a change in the
workload of the system.

7)  Tactical  Planning  - Determining  how  the  two  experiments  are  to  be 
run  and  tested. 

8) Experimentation - Execution of the simulation model to perform
these experiments and to perform a sensitivity analysis of the
system. The sensitivity of the model is presented on page 38 and a
discussion of the experiments is located on page 40.

9)  Interpretation  - Drawing  inferences  from  the  data  gathered  from 
the  experimentations.  This  will  be  done  in  the  conclusion  section 
of  this  paper. 

10)  Implementation  - The  results  of  these  experimentations  will  be 
presented  to  AFSC. 

11)  Documentation  - The  results  of  the  experimentations  are  listed  in 
Appendix  A. 

The formulation of the problem has been presented in the Introduction
while the system definition was presented in Chapter 3. It is now time
to consider the use of a simulation language and to look at what is
involved in the Model Formulation stage of the simulation process.


IV Simulation Steps


Model  Formulation 

Having discussed the problem to be investigated and shown the
need for a computer simulation analysis, Shannon (Ref. 28:24) suggests
that the next step concerns the reduction of the real system to a logic
flow diagram that accurately depicts the process to be simulated. This
flow approach to analyzing the IBM 370/155 will allow the workload
parameters and performance measures to be incorporated into the simulation
model. Figure 6 on pages 29 and 30 is a flow diagram of the necessary
events to simulate the job processing of batch and interactive jobs.

Once the real system is represented by logic flow, the necessary
data input and starting conditions are determined. Some data generation
is internal to the program, such as pseudorandom numbers and stochastic
variates, and must be generated appropriately. Once it is known how
the real system should be represented, it is important that a simulation
language be chosen that will allow a meaningful description to be
implemented in a computer simulation program. Unfortunately, the
best simulation language for a particular investigation is often overlooked
simply because the analyst may be familiar with a specific language
already and may feel that too much time and effort would be needed in
determining which language is best and then having to learn that new
language.

Heidenreich and Blitt (Ref. 7:38) have listed the advantages
and disadvantages of different languages by comparing 9 languages
including the Simscript and GPSS simulation languages. Their comparison
suggests that Simscript is designed for simulation because it is event
oriented but is slow to execute and not well known by its users. GPSS
is also a simulation language but appears to be easier to learn than
Simscript. A prime consideration between the languages might be simply
"Which one is available to the investigator?". Simscript was chosen
because it is suited to general programming and discrete-event simulation
modeling, and because of familiarity with the language.

Fig 6a. Job Processing Sequence

Logic  Flow 

Again, Figure 6 shows the logic flow for the continuous simulation
model which represents the AMIS system employed on the IBM 370/155.
The actual program is listed in Appendix B, and the description of the
logic flow given here follows the events in the program model.

The "Main" routine initializes variables and schedules the first
interactive job and the first 6 batch requests. All simulation time
is based upon parts of a day, so all units must be established here
in "Main" to avoid errors that result from working with wrong units.
Once the variables have been initialized and all events scheduled,
the "... 'Start Simulation' statement begins the simulation by passing
control to the timing routine which removes the first event notice
from the event set and executes that event." (Ref. 28:210)
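
The role of "Main" and of the timing routine can be sketched in Python.
This is an illustrative analogue and not the Simscript program: the
Simulation class, the job_request routine and the particular start
times are assumptions, chosen only so that one interactive and six
batch requests fall within the first minute and all times are expressed
as fractions of a day.

```python
import heapq
import itertools

MINUTES = 1.0 / (24 * 60)     # one minute expressed as a fraction of a day
SECONDS = MINUTES / 60        # one second expressed as a fraction of a day


class Simulation:
    def __init__(self):
        self.now = 0.0
        self._events = []                 # the event set
        self._order = itertools.count()   # tie-breaker for equal times

    def schedule(self, delay, routine, *args):
        """File an event notice to run `routine` after `delay` days."""
        notice = (self.now + delay, next(self._order), routine, args)
        heapq.heappush(self._events, notice)

    def start(self, until=float("inf")):
        """Timing routine: remove the first event notice and execute it."""
        while self._events and self._events[0][0] <= until:
            self.now, _, routine, args = heapq.heappop(self._events)
            routine(self, *args)


def main():
    sim = Simulation()
    log = []

    def job_request(sim, kind, number):
        log.append((kind, number, sim.now))

    # Initial conditions: 1 interactive and 6 batch requests.
    sim.schedule(2 * SECONDS, job_request, "interactive", 1)
    for n in range(1, 7):
        sim.schedule(n * 8 * SECONDS, job_request, "batch", n)
    sim.start()
    return log
```

Keeping the unit constants in one place is a simple way to follow the
advice that all units be established in "Main".
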

Next, statistical generation of job input parameters is accomplished
in the I.JOB.REQUEST (interactive job request) and B.JOB.REQUEST (batch
job request) events. The individual job CPU time, I/O time, number of
I/O's per job and carriage return, core requirements and job priority are
generated within these two events. The job priority is determined by
the algorithm shown in Table 1 and is based solely upon the amount of
core requested, CPU time, and tape mount and disc mount requests. Once
the job characteristic parameters and job priority have been determined
the job is placed into the 'job.queue'. The individual "job" entity
has certain job attributes (job parameter values listed in the preamble)
which are "moved" with the job throughout the simulation process by use
of pointers when referencing the job. Once the job has been placed into
the job queue the job scheduler scans the jobs in the queue to determine
which job will get access to the central memory.

"...  the  job  scheduler  chooses  a small  subset  of  the  jobs  submitted 
and  lets  them  'into'  the  system.  That  is,  the  job  scheduler  creates 
processes for these jobs and assigns the processes some resources.
It must decide, for example, which two or three jobs of the 100
submitted  will  have  any  resources  assigned  to  them.  The  process 
scheduler  decides  which  of  the  processes  within  this  subset  will  be 
assigned  a processor,  at  what  time,  and  for  how  long."  (Ref.  14:211) 

This processing management checks to see if necessary devices are available,
creates processes for the job and then assigns memory when possible. The
request for memory is based upon the job class priority (and the priority
within a class), and when memory is available the job is loaded into
core to wait for access to the CPU.

The job is now in the CPU.queue; this queue being formed every
time the CPU is not available for job processing. When the CPU is released
a job is selected from the CPU.queue according to the highest class
priority and scheduled for the event "CPU.processing".

Once the job is processing, the job may be pre-empted by an interrupt
that is initiated by a job which has a higher priority. At this point
the CPU is released and the current job being processed is placed back into
the "CPU.queue" and the higher priority job is scheduled for the event
"CPU.processing". If the job being processed initiates an I/O, the job will
release the CPU and the I/O task is completed before the job is re-inserted
into the CPU.queue (unless of course this is the job's last I/O).

If the job has completed its last I/O then as soon as the I/O
buffer is empty, the core relinquished by the job may be used by another
job waiting in the CM queue. When the I/O channel becomes free, the job
will be placed in an output queue and appropriately disposed to the
output terminal. After the job has been disposed to the output terminal
the output disc is released for use by other jobs.

The event AN.DISC.RELEASE "destroys" the individual batch or interactive
job but tallies up statistics about the job before removing the job's
attributes from the simulation program.
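
The handling of the CPU.queue described in this section, selection by
highest class priority and pre-emption by a higher priority job, can be
sketched in Python as follows. The CpuScheduler class is hypothetical,
and the convention that a larger number means a higher priority is an
assumption; the actual priority classes come from Table 1.

```python
import heapq
import itertools


class CpuScheduler:
    def __init__(self):
        self._queue = []                  # heap of (-priority, order, name)
        self._order = itertools.count()   # first-come order within a priority
        self.running = None               # (priority, name) of the job on the CPU

    def _enqueue(self, priority, name):
        heapq.heappush(self._queue, (-priority, next(self._order), name))

    def request(self, name, priority):
        """Ask for the CPU; return the name of a pre-empted job, if any."""
        if self.running is None:
            self.running = (priority, name)
            return None
        if priority > self.running[0]:
            preempted = self.running
            self._enqueue(*preempted)     # interrupted job rejoins the queue
            self.running = (priority, name)
            return preempted[1]
        self._enqueue(priority, name)
        return None

    def release(self):
        """Running job leaves the CPU; dispatch the highest priority waiter."""
        if self._queue:
            negative_priority, _, name = heapq.heappop(self._queue)
            self.running = (-negative_priority, name)
            return name
        self.running = None
        return None


cpu = CpuScheduler()
cpu.request("batch-1", priority=2)        # CPU free, batch-1 starts
cpu.request("batch-2", priority=1)        # waits in the queue
print(cpu.request("inter-1", priority=5)) # batch-1 (pre-empted)
print(cpu.release())                      # batch-1 (highest priority waiting)
```
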

Data  Preparation 

In order to simulate the job processing accurately there are five
major parameters that must be generated for each job; namely, the amount
of I/O time, the need for discs, the need for tapes, the amount of core needed
and the amount of CPU time to process the job. Ensuring that sample jobs
with these characteristics are selected and truly represent the job
workload of the system would require an additional effort, and more
time than has been allocated for this study.

It was decided that the average statistics of S2K jobs (AMIS major production
jobs, which account for about 25% of the workload) would be adequate to
drive this simulation model. In the recommendations section of this paper
the use of cluster analysis will be presented as a means for acquiring sample
inputs which would reduce the amount of data that would need to be read into
the program during execution.

Programming  Techniques 

The discrete-event capability of Simscript II.5 allows for good
modularity throughout the simulation program. Modularization allowed the
program to be divided into subprograms (events) which are called when the
timing routine has been scheduled. With this type of event, the need for
global flags was minimized and consequently the hierarchical structure was
forced to simulate the real system more realistically. Without the need
for global flags the modifiability of the model is enhanced as well.

Modifiability implies that when changes are made to a portion of the
program, the changes in one subroutine require few, if any, changes in other
subroutines throughout the program. This modifiability principle was
used in constructing the two events which "parameterize" the batch and
interactive jobs.

The internal generation of job attributes can easily be changed to
read data as is recommended in the last chapter of this paper. This
simplifies the transition process of modifying the program. Another
way to make changes easy is to use the principle of understandability.


Simscript II.5 allows variables, routines and events to be in
alphanumeric form and hence the principle of understandability is
strengthened throughout the program. As long as the first 5 characters
differ, there is no confusion when transferring execution from
one event to another event or routine. This means that the names of
the labels, events, routines and variables can be spelled out in a
string of short words separated by "periods" when coding the program.
Understandability is not merely a property of legibility as the entire
conceptual structure of the model is involved.

The principles of modularity, modifiability, and understandability
were achieved by using top down design methods in building the conceptual
structure of this model. After the model had been built and debugged it
was necessary to validate the model.

Validation of the Simscript Model

The investigator must have confidence in inferences and results
obtained from the computer simulation results. Confidence in the model
is established if the results from the model compare favorably and
accurately with the real system results. If the model results vary
drastically from those of the real system then changes must be made in
the structure of the model or in the variables and parameter estimates
that were used in the data preparation stage. If the model is simple
to understand and easy for the user to control and manipulate then
the analyst will find this iterative process manageable.

Before inferences can be drawn from the validated model, certain
assumptions must be made. The input parameters must be as accurate as
possible before a high level of confidence in the simulation results is
achieved.

Assumptions 

The AMIS jobs account for 90% of the workload on the IBM 370/155
and during a given typical week of November 1977, this would amount to
about 12-15 hundred jobs. During this period the S2K jobs (one type
of AMIS major production jobs) accounted for about one quarter of the
jobs run. Data about these jobs revealed that of 266 interactive jobs
an "average" job initiated 2,739 I/O's needing about 44 CPU-seconds
for execution of the job. The typical batch job initiated 1,215 I/O's
with about 34 CPU-seconds required. Since I/O time is not needed to
determine class priorities it is difficult to ascertain close figures
for job I/O times. It is known that the AMIS jobs are highly I/O bound
jobs because of the time needed to search the data bases, so a ratio of
20 to 1 was chosen to represent I/O vs CPU time for batch jobs, and a
ratio of 4.5 to 1 for interactive jobs was selected. Assuming that the
S2K jobs are "representative" of the workload, a 4 hour simulation time
period was chosen to gather data for analysis. Again, the time allocated
for this study did not allow for an analysis of data beyond this single
4 hour period.
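
As a worked illustration of these assumptions, the Python sketch below
applies the chosen I/O-to-CPU ratios to the average CPU times quoted
above. Applying the ratio directly to the average job is an assumption
made for this example; how the ratio enters the Simscript program is
given by the listing in Appendix B.

```python
# Figures quoted in the text for the "average" S2K job of each type.
AVERAGE_CPU_SECONDS = {"batch": 34.0, "interactive": 44.0}
IO_TO_CPU_RATIO = {"batch": 20.0, "interactive": 4.5}


def estimated_io_seconds(job_type):
    """I/O time implied by the assumed I/O-to-CPU ratio for a job type."""
    return IO_TO_CPU_RATIO[job_type] * AVERAGE_CPU_SECONDS[job_type]


print(estimated_io_seconds("batch"))        # 680.0 seconds of I/O
print(estimated_io_seconds("interactive"))  # 198.0 seconds of I/O
```
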

Assuming that 28 batch jobs are scheduled during this 4 hour period
and that 40 interactive jobs are run during this same time, the initial
baseline run showed that the model reflected a CPU utilization of 87%
with a gain factor of 2.3684. (See definition of gain factor on page 22)
This compares favorably with the 85% CPU utilization of the real system.
Appendix A contains information about the baseline run and all other
subsequent runs as well. The results were calculated after a warm-up
period that is explained and discussed next.

Steady  State 

Shannon (Ref. 28) recommends three possible ways to eliminate errors
from simulation results that were introduced early in the warm-up
of the system simulation.

1)  Run  the  computer  simulation  long  enough  and  the  initial  warm-up 
errors  will  be  absorbed  into  an  accurate  average  of  the  collected 
statistics. 

2)  Choose  initial  starting  conditions  that  accurately  reflect  a given 
operating  time  period  which  will  reduce  errors  during  this  transient 
period. 

3)  Throw  out  or  just  don't  calculate  statistics  during  an  appropriate 
warm-up  period. 

The method employed in this investigation merged ideas from both
points 2 and 3 above. The initial starting conditions provided for the
scheduling of 1 interactive and 6 batch jobs to begin execution within
the first minute of simulation time. After about 5 3/4 minutes a flag
was set to start gathering job workload statistics. Figure 7 on page 37
shows what effect this warm-up period had on CPU utilization of the system.

Fig 7. Transient Warm-up Period

The time interval between "start simulation" and time A represents 30 seconds,
while between point A and C each division represents a 20 second time
interval. The cumulative CPU utilization remains fairly constant up to
point B (approximately 5 3/4 minutes) when a flag was set to recalculate
the cumulative CPU utilization. From point B to point C it can be seen
that larger variations occurred. Point C represents the last regular
interval where the CPU utilization was calculated and every point beyond
C reveals the cumulative CPU utilization when any given job was completed.

It appears that the cumulative CPU utilization stabilizes in the neighborhood
of about 82 to 88 percent. By using point B as the starting reference
point the results are weighted negatively by the rapid drop from point
B to point C; but the immediate rise following point C will help keep this
negative weight factor from drastically altering the true cumulative
CPU utilization as it stabilizes near the end of the 4 hour run. Now that the
transient warm-up errors have been reduced, a sensitivity analysis of the
model's predictions is appropriate.
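
The effect of discarding the warm-up period can be illustrated with a
short Python sketch. The function below computes the cumulative CPU
utilization from a list of busy intervals and ignores everything before
a chosen warm-up time; the function name, the interval representation
and the sample data are assumptions for illustration, not output of the
model.

```python
def cumulative_utilization(busy_intervals, end_time, warmup=0.0):
    """Percent of the time between `warmup` and `end_time` the CPU was busy.

    busy_intervals is a list of (start, stop) pairs in the same time unit.
    Busy time before the warm-up point is not counted.
    """
    observed = end_time - warmup
    if observed <= 0:
        raise ValueError("end_time must be later than the warm-up point")
    busy = 0.0
    for start, stop in busy_intervals:
        start = max(start, warmup)
        stop = min(stop, end_time)
        if stop > start:
            busy += stop - start
    return 100.0 * busy / observed


# Times in minutes; the CPU is mostly idle while the first jobs load.
intervals = [(0, 1), (4, 10), (12, 22)]
print(cumulative_utilization(intervals, 22))             # about 77.3
print(cumulative_utilization(intervals, 22, warmup=6))   # 87.5
```

In this made-up example the early idle time pulls the cumulative figure
down, and restarting the statistics after the warm-up point removes
that bias.
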

Sensitivity  Analysis 

A total of 29 simulation runs were made to determine what effect
changes in I/O and CPU time would have on the cumulative CPU utilization
and gain factor. The first graph on page 52 shows how a workload increase
will affect the two variables. In this graph, as well as the rest of the
graphs in Appendix A, the vertical dashed line represents the initial
baseline run. It appears that the gain factor is on a steady increase and
that CPU utilization falls below the baseline value of 86.67% after
an increase of 10% in CPU and I/O time.

The chart on page 53 shows what effect an increase in I/O and decrease
in CPU time will have on the variables of interest. The CPU utilization is
on a "saw tooth" decline while the gain factor is also on a "saw tooth"
rise which seems to rise when the CPU utilization decreases and falls
when the CPU utilization increases. The baseline run is at a relative
maximum point for CPU utilization with a slightly lower than average
point calculated for the gain factor.

The chart on page 54 shows what effect changes in CPU time will
have on the results when the I/O time is kept constant. The CPU
utilization is on a constant rise until the CPU time exceeds the
neighborhood of a plus 10% increase, at which point the CPU utilization
decreases by about [figure illegible in source]. The baseline run reveals
that a 10% increase in the CPU time of all jobs run will allow a 3%
increase in CPU utilization while decreasing the gain factor by a
factor of .2927.

The results on page 55 show that when keeping the CPU time (of all
the jobs run) constant, an increasing change in I/O time will cause a gradual
decrease in CPU utilization until going beyond a 10% increase, where this
variable drops sharply. The gain factor remains on a gradual increase
throughout the increase in I/O time.

The next eight charts, page 56 through page 63, show variations of
changing the CPU and I/O times of all the jobs run. These graphs reveal that
the baseline run is slightly below or generally above the CPU utilization
figures and that the gain factor is at about the mid-point or generally
below the values calculated for the other runs. These sensitivity runs
show that there are only two main areas where the results vary drastically
from the baseline run. The first is where the CPU time is increased by 10%
and the I/O time decreased by 10%; the other is where the CPU time is
decreased by 20% and the I/O time is increased by 20%. Using the workload
characteristics of the baseline run, 10 more computer runs were made to help
in the analysis of the batch-interactive-mix problem.

Experimentation 

Two  separate  single  factor  experiments  were  run  to  determine  what 
effect  an  increase  in  the  interactive  workload  and  a decrease  in  the  number 
of  ports  available  to  run  interactive  jobs  would  have  on  the  CPU  utilization 
and  gain  factor.  Five  runs  were  made  for  each  investigation  with  the  intent 
that  each  separate  run  be  considered  a datum  point. 

The chart on page 64 shows that as the number of interactive jobs increases
up to an additional 50% increase in the interactive workload, the CPU
utilization climbs by about 8 1/2% while the gain factor decreases by a factor
of .9550. There is every indication that the CPU could be used more
efficiently but at the same time it appears that a decline in throughput and
turnaround performance might result. The results on page 65 (decrease in
the number of interactive terminals) are just about the same as those obtained
when increasing the interactive workload.

Allowing just one factor (i.e., number of interactive terminals or
interactive workload) to vary with each computer run provides a limited
analysis concerning throughput performance. Ideally interactions between
these two factors should be considered but due to the cost of each computer
run it was decided that these extra runs would not be made. As an aid to
the reader, the following discussion will help in the determination of the
important factors and the number of runs to be made during a simulation
analysis.

Design Tradeoffs

Shannon (Section 4.6) discusses a tradeoff study between the
number of factors (input parameters, or variables), number of factor levels,
number of replications and total number of computer runs required. Figure 6
shows a nomograph; the dark arrows show how to enter the nomograph and
calculate the expected total computer cost and number of computer runs to
be made.


Where

k = number of factors (input parameters or variables)

q = number of factor levels

p = number of replications

M = total number of computer runs required.

In  section  4.6,  Shannon  lists  3 equations  for  determining  which  of  the 
variables  (level,  factor  and  replication)  are  the  most  dominant.  By 
knowing  which  variable  is  dominant  the  investigator  can  save  computer  costs 
by  decreasing  the  level  (if  applicable  - without  altering  results  drastically) 
of  the  dominant  variable.  For  example,  if  the  number  of  factors  is 
unquestionably  the  dominant  variable  then  the  number  of  factors  might  be 
reduced  to  save  computer  runs  without  sacrificing  the  importance  of  the 
results. 
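
The tradeoff among k, q, p and M can be sketched numerically. The sketch below assumes a full factorial design in which every factor is run at the same number of levels, so that M = p * q**k; this relation is a standard property of such designs and is not read off the nomograph in Figure 6. The function names and the cost per run are hypothetical.

```python
def total_runs(k: int, q: int, p: int) -> int:
    """Runs needed for a full factorial design (an assumption of this sketch):
    every combination of q levels across k factors, replicated p times."""
    return p * q ** k


def total_cost(k: int, q: int, p: int, cost_per_run: float) -> float:
    """Expected total cost, given a hypothetical fixed cost for each run."""
    return total_runs(k, q, p) * cost_per_run


if __name__ == "__main__":
    # Dropping a factor shrinks the run count far more than dropping a replication.
    for k in (2, 3, 4):
        print(f"k={k}, q=3, p=2 -> M={total_runs(k, 3, 2)}")
```

Under these assumptions, crossing the two factors of the experimentation section at five levels each with a single replication would take total_runs(2, 5, 1) = 25 runs, compared with the 10 single-factor runs that were made.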

The "Pareto Principle" allows the investigator to reduce the number
of factors because this principle revolves around the thesis that in most
systems roughly 20% of the factors explain nearly 80% of the
simulation results, and that the remaining 80% of the factors account for
only 20% of the observed performance of the system (Ref. 28:153). The
difficult part of reducing the number of factors is to actually determine
which factors can be placed in the 20% high performance category.

The two main factors in the AMIS simulation are the amount of CPU
and I/O time, and the response variables of experimental interest are
the gain factor and the CPU utilization. These response variables allow
an analysis of the throughput performance to be conducted, but as was
mentioned previously, much more needs to be done in determining accurate
input data so that the CPU and I/O factors will produce more
reliable results.


V Recommendations 


The results from this simulation effort are not extremely accurate
and could be improved by the use of cluster analysis. The input data,
which currently is made up of about 25% of the workload, could become more
representative of all types of AMIS jobs if cluster analysis were to
be employed in gathering data.

Cluster Analysis

Cluster analysis is a statistical tool for analyzing data by
developing a data classification or identification. The end result of
this classification or identification is a collection of data that
has a high "natural association" among members of the same group,
and a much lesser association between data that have been clustered
into different groups. There are several clustering strategies but all
methods involve two primary considerations. "First, there is defined
a measure of group-density or of inter-group likeness. Examples of
the latter type of measure (The so-called 'similarity coefficients')....
Secondly, the chosen measure has to be incorporated into a 'sorting
strategy' whereby groups of elements are extracted." (Ref. 3:373)

Anderberg (3) briefly discusses four metrics that might be used for
clustering data which include the so-called "city block" metric, the
Chebychev metric and the familiar Euclidean distance metric.

Euclidean  Metric 

Anderberg states that any metric must satisfy the following
conditions:

1) D(X,Y) = 0 if and only if X = Y,

2) D(X,Y) ≥ 0 for all X and Y in E,

3) D(X,Y) = D(Y,X) for all X and Y in E,

4) D(X,Y) ≤ D(X,Z) + D(Y,Z) for all X, Y, and Z in E,

where E is the symbolic representation for a measurement space
and X, Y and Z are any three points in the space E.

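The distance between data units can be found by using the Euclidean
metric where

$$ D(X,Y) = \left[ \sum_{i=1}^{n} (x_i - y_i)^2 \right]^{1/2} $$

and x_i and y_i are the values of the i-th of n variables for data
units X and Y.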
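
As an illustration, the three metrics named above can be written in a few lines. This is a minimal sketch using the standard textbook definitions of the city block, Chebychev and Euclidean distances; the function names are hypothetical and are not taken from Anderberg.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def city_block(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def chebychev(x: Sequence[float], y: Sequence[float]) -> float:
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    x, y = (0.0, 0.0), (3.0, 4.0)
    print(euclidean(x, y), city_block(x, y), chebychev(x, y))
```

Each function is symmetric in its arguments and returns zero only for identical points, in line with conditions 1) and 3).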


When dealing with two or more variables, the Euclidean metric can
be used to determine clusters, and Green (18) has suggested the
following steps to compute the cluster analysis.

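1. Each variable (characteristic) is transformed so that the data
becomes a standard distribution, where the following rule converts the
variables to a standardized variate with zero mean and unit standard
deviation:

$$ z = \frac{x - \bar{x}}{s} \qquad (4) $$

where \bar{x} and s are the mean and standard deviation of the variable x.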


2.  Distances  between  all  possible  pairs  of  data  are  calculated  using 
the  Euclidean  Metric. 

3. The pair of data points with the smallest distance between the two
points is chosen as the initial node of the first cluster and the
centroid of this pair is calculated.

4. "Additional points are added to this cluster (based on "closeness"
to the last-computed average) until:

A. Some pre-specified number of points has been clustered, or
B. The point to be added to the cluster exceeds some pre-specified
distance cutoff number." (Ref. 18:391)

Anderberg suggests using the single-linkage method for accomplishing
this step in the cluster analysis. As new data points are added
to the cluster the new centroid must be calculated so that condition B is
not violated when an additional point is merged or "linked" to the cluster.

If it is known that the distance between the cluster centroid and
the nearest neighbor cannot exceed some "coarsening parameter" value,
then a new cluster node is established by determining another centroid
between the points with the smallest distance between them.

5. The new additional cluster node is used as a basis for building a new
cluster and step 4 is repeated.

6. To avoid artificially fine distinctions between clusters, the points
may be allowed to be in more than one cluster.
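
The steps above can be sketched as a short program. This is only an illustration under stated assumptions: closeness is measured to the last-computed centroid, the sample standard deviation is used for standardizing, each point is placed in exactly one cluster (so step 6 is not implemented), and the names `max_points` and `cutoff` are hypothetical stand-ins for the pre-specified limits A and B.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Step 1: give every column zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # assumes no column is constant
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def dist(x, y):
    """Step 2: Euclidean distance between two data units."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(points):
    return tuple(mean(c) for c in zip(*points))


def grow_cluster(points, free, max_points, cutoff):
    """Steps 3 and 4: seed with the closest free pair, then add the point
    nearest the current centroid until limit A or limit B is reached."""
    i, j = min(((a, b) for a in free for b in free if a < b),
               key=lambda ab: dist(points[ab[0]], points[ab[1]]))
    members = [i, j]
    while len(members) < max_points:                 # limit A
        rest = [n for n in free if n not in members]
        if not rest:
            break
        c = centroid([points[m] for m in members])   # recomputed each time
        nearest = min(rest, key=lambda n: dist(points[n], c))
        if dist(points[nearest], c) > cutoff:        # limit B
            break
        members.append(nearest)
    return members


def cluster(rows, max_points, cutoff):
    """Step 5: keep starting new clusters from the points not yet clustered."""
    points = standardize(rows)
    free = set(range(len(points)))
    clusters = []
    while len(free) >= 2:
        members = grow_cluster(points, free, max_points, cutoff)
        clusters.append(sorted(members))
        free -= set(members)
    clusters.extend([n] for n in sorted(free))       # a leftover single point
    return clusters
```

The function returns lists of row indices, so the caller can map each cluster back to the original (unstandardized) job data.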

The above has been a general discussion of cluster analysis; the
section that follows will suggest a practical use of this technique
to refine the data generation employed to drive the Simscript model.


Practical  Use  of  Cluster  Analysis 


A computer program which performs a cluster analysis on the input
data (i.e., CPU time, amount of core needed, and need for tape or disc
mounts) should be the next step in the analysis of the system. Forgy (31)
has provided a simple algorithm consisting of the following steps, which
will help in such a cluster analysis:

1) Since there are 11 general priority class memberships, the
end points, for both CPU time and core needed, within each class
should become the initial cluster boundaries.

2) Using the two end points in conjunction with the mid-point of the
boundaries as seed points, run the data to allocate each data unit
to the cluster with the nearest seed point. The seed points must
remain fixed during the complete run.

3)  Compute  new  seedpoints  from  the  centroids  of  the  resulting  clusters. 

4)  Re-run  the  data  using  the  new  centroids  as  seed  points  and  repeat 
this  iteration  until  no  data  units  change  their  cluster  membership. 

It is not certain just how many runs will have to be made before this
iterative process stabilizes, but Forgy suggests that from empirical
evidence, this will ordinarily be accomplished within the first 5 runs.
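
The iteration described in steps 2) through 4) can be sketched as follows. This is a minimal illustration, not Forgy's published program: the initial seed points are passed in by the caller (for AMIS they would be the class end points and mid-points from step 1), Euclidean distance is assumed, and `max_passes` is a hypothetical safety limit.

```python
import math


def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(points):
    return tuple(sum(c) / len(c) for c in zip(*points))


def forgy(data, seeds, max_passes=20):
    """Allocate each data unit to its nearest seed, recompute the seeds as
    cluster centroids, and repeat until no data unit changes cluster."""
    seeds = [tuple(s) for s in seeds]
    assignment = None
    passes = 0
    for passes in range(1, max_passes + 1):
        # Steps 2 and 4: the seed points stay fixed for the whole pass.
        new_assignment = [
            min(range(len(seeds)), key=lambda k: dist(x, seeds[k]))
            for x in data
        ]
        if new_assignment == assignment:
            break  # memberships are stable
        assignment = new_assignment
        # Step 3: the new seed points are the centroids of the clusters.
        for k in range(len(seeds)):
            members = [x for x, a in zip(data, assignment) if a == k]
            if members:  # an empty cluster keeps its old seed
                seeds[k] = centroid(members)
    return assignment, seeds, passes


if __name__ == "__main__":
    jobs = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
    print(forgy(jobs, seeds=[(0, 0), (10, 10)]))
```

The returned pass count includes the final pass that confirms no membership changed, which gives a direct check on the estimate of about five runs.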

When the cluster analysis has been completed there should be a
total of 33 separate clusters (three seed points for each of the 11
priority classes), each having a centroid value for the CPU
time and core needed. The frequency of each cluster (the number of data
points within each cluster) should be calculated and the data is then
ready to drive the simulation model.

Now that the data has been reduced to 33 clusters that accurately
represent the workload characteristics of the system, the "centroid data"
can be called at random according to the frequency of each cluster.

This means that 33 data points, called at random, will suffice for
valid inputs. This is much superior to the internal generation of
values, the method employed to this point, and allows the analyst
to include data from all the different types of jobs run on the system.
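
Calling the centroid data at random according to cluster frequency amounts to weighted sampling. The sketch below shows one way to do this with the Python standard library; the function name, the (CPU time, core) tuple layout and the sample values are hypothetical and do not come from the AMIS data.

```python
import random


def draw_jobs(centroids, frequencies, n, seed=None):
    """Draw n job descriptions, picking each cluster centroid with a
    probability proportional to the number of data points in its cluster."""
    rng = random.Random(seed)
    return rng.choices(centroids, weights=frequencies, k=n)


if __name__ == "__main__":
    # Hypothetical centroids: (CPU seconds, core needed in K bytes).
    centroids = [(1.0, 64), (5.0, 128), (30.0, 256)]
    frequencies = [120, 45, 10]
    for job in draw_jobs(centroids, frequencies, n=5, seed=1):
        print(job)
```

Passing a fixed `seed` makes a run repeatable, which helps when comparing a sensitivity run against the baseline.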


VI  Conclusions 

1) Using the data about 32K major production jobs, the baseline simulation
run generated a CPU utilization of 36.6?%. This value is close to the
real system CPU utilization of 85%, but it must be remembered that
the results in this investigation were not obtained from using data
established by the cluster analysis technique in the previous section.

In most of the charts in Appendix A the baseline figure for the CPU utilization
was higher than the neighboring datum points (sensitivity runs). It appears
that although the CPU utilization increased during a few sensitivity runs,
there was a consistent decline in the gain factor (degradation of the
throughput or turnaround time). From the results obtained, it appears that
a change in the workload will allow an increase in the CPU utilization but
poorer turnaround results will also arise for the AMIS users.

2) The graph on page 64 supports the idea that the gain factor lowers
in conjunction with an increase in CPU utilization. The gain factor
for the baseline run was at a maximum, decreasing substantially as
the interactive workload increased.

3) Never once were there 35 or more interactive terminals being used
concurrently. From the results obtained and shown on page 65, a decrease
in the number of interactive ports causes a 2% increase in CPU utilization,
while at the same time causing the gain factor to fall by .3935.

4)  If  real  time  updates  were  to  be  performed  interactively  by  AMIS 
users,  the  simulation  results  predict  that  throughput  performance  would 
probably  decline.  This  premise  is  drawn  from  the  data  on  page  64  where 
an  increase  in  the  interactive  workload  reflects  a steady  decline  in 
the gain factor.

APPENDIX A
CPU Utilization vs Gain Factor

The computer program in Appendix B took nearly 60 seconds to compile
and each run took about 185 seconds of execution time. The ratio of
simulation time to execution time (time to execute each run) is
approximately 80 to 1 (it took 185 seconds to execute the simulation program
for a simulated period of 4 hours, or 14,400 seconds). The model itself is
comprised of 23 separate events (similar to subroutines) beginning with the
preamble.

The different job attributes, events and parameters are established
in the preamble while the actual values are established in the main
routine. The variables of interest in each event are given a
short description within the event in which they occur. Generally these
descriptions are placed between two lines of "stars" which are physically
placed near the beginning of each event or routine.